108 results found.
Written
Lexicon,
Language Type:
Multilingual
Languages:
Chinese French German Italian Japanese Spanish Thai Vietnamese
Availability:
Freely Available
License:
Creative Commons Attribution-Share-Alike License 3.0
Size:
22.5 GByte
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: A Simple and Effective Approach to Robust Unsupervised Bilingual Dictionary Induction
- Paper track: Long paper/
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yanyang Li | Wiki word vectors | /N |
Documentation:
There is publicly available documentation in English.
Written
Treebank,
Language Type:
Monolingual
Languages:
Bengali Chinese English Filipino Hindi Indonesian Japanese Khmer Lao Malay Myanmar Thai Vietnamese
Availability:
Freely Available
License:
CreativeCommons
Size:
20106 sentences
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation
- Paper track: Long paper/
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Abhisek Chakrabarty | Asian Language Treebank Parallel Corpus | /N |
Documentation:
http://www2.nict.go.jp/astrec-att/member/mutiyama/ALT/ALT-Parallel-Corpus-20191206/README.txt
Written
Treebank,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
CC BY 4.0
Size:
1.15 MByte
Production Status:
Existing-used
Use:
Discourse
- Paper title: Chinese Paragraph-level Discourse Parsing with Global Backward and Local Reverse Reading
- Paper track: Long paper/
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Feng Jiang | MCDTB | /N |
Documentation:
MCDTB: A Macro-Level Chinese Discourse TreeBank
Written
Ontology,
Language Type:
Bilingual
Languages:
Chinese English
Availability:
From Data Center(s)
License:
Size:
39 MByte
Production Status:
Existing-used
Use:
Knowledge Discovery/Representation
- Paper title: End to End Chinese Lexical Fusion Recognition with Sememe Knowledge
- Paper track: Long paper/
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yijiang Liu | HowNet Knowledge Database | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
License:
Size:
711 MByte
Production Status:
Existing-used
Use:
Corpus Creation/Annotation
- Paper title: End to End Chinese Lexical Fusion Recognition with Sememe Knowledge
- Paper track: Long paper/
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yijiang Liu | SogouCA | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
Size:
55,566 patents
Production Status:
Newly created-finished
Use:
Language Modelling
- Paper title: Named Entity Recognition for Chinese biomedical patents
- Paper track: Long paper/
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Suzan Verberne | Chinese biomedical patents | /N |
Documentation:
https://github.com/yukihuyt/Chinese_biomed_patents_NER
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
Size:
5813 sentences
Production Status:
Newly created-finished
Use:
Information Extraction, Information Retrieval
- Paper title: Named Entity Recognition for Chinese biomedical patents
- Paper track: Long paper/
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Suzan Verberne | Chinese patents labelled with biomedical entities | /N |
Documentation:
https://github.com/yukihuyt/Chinese_biomed_patents_NER
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
OpenSource
Size:
400 MByte
Production Status:
Existing-used
Use:
Question Answering
- Paper title: Synonym Knowledge Enhanced Reader for Chinese Idiom Reading Comprehension
- Paper track: Long paper/
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Siyu Long | ChID | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
Afrikaans Albanian Amharic Arabic Aragonese Armenian Assamese Azerbaijani Basque Belarusian Bengali Bosnian Breton Bulgarian Burmese Catalan Central Khmer Chinese Croatian Czech Danish Dutch Dzongkha English Esperanto Estonian Finnish French Gaelic Galician Georgian German Greek Gujarati Hausa Hebrew Hindi Hungarian Icelandic Igbo Indonesian Irish Italian Japanese Kannada Kazakh Kinyarwanda Korean Kurdish Kyrgyz Latvian Limburgan Lithuanian Macedonian Malagasy Malay Malayalam Maltese Marathi Mongolian Nepali Northern Sami Norwegian Norwegian Bokmål Norwegian Nynorsk Occitan Oriya Panjabi Pashto Persian Polish Portuguese Romanian Russian Serbian Serbo-Croatian Sinhala Slovak Slovenian Spanish Swedish Tajik Tamil Tatar Telugu Thai Turkish Turkmen Uighur Ukrainian Urdu Uzbek Vietnamese Walloon Welsh Western Frisian Xhosa Yiddish Yoruba Zulu
Availability:
Freely Available
License:
Size:
55 million sentences
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation
- Paper track: Long/Machine Translation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Biao Zhang | the open parallel corpus (OPUS) | /N |
Documentation:
None
Not Applicable
Contextualised word embeddings,
Language Type:
Monolingual
Languages:
Ancient Arabic Basque Bokmål Bulgarian Catalan Chinese Church Croatian Czech Danish Dutch English Estonian Finnish French Galician German Greek Hebrew Hindi Hungarian Indonesian Irish Italian Japanese Korean Latin Latvian Norwegian Nynorsk Old Persian Polish Portuguese Romanian Russian Simplified Chinese Slavonic Slovak Slovene Spanish Swedish Turkish Ukrainian Urdu Uyghur Vietnamese
Availability:
Freely Available
License:
none
Size:
18.4 GByte
Production Status:
Existing-used
Use:
Parsing and Tagging
- Paper title: Treebank Embedding Vectors for Out-of-domain Dependency Parsing
- Paper track: Short/Syntax: Tagging, Chunking and Parsing
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Joachim Wagner | Elmo For Many Languages | /N |
Documentation:
https://www.aclweb.org/anthology/K18-2005/




